AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
Lin, Zichuan, Liu, Yicheng, Yang, Yang, Tao, Lvfang, Ye, Deheng
Vision-Language Models (VLMs) have achieved remarkable success in visual question answering tasks, but their reliance on large numbers of visual tokens introduces significant computational overhead. While existing efficient VLM approaches reduce visual tokens through fixed-ratio compression, they operate passively and lack the ability to adapt to varying task requirements. This motivates a fundamental question: Can VLMs autonomously determine the minimum number of visual tokens required for each sample? Inspired by human active vision mechanisms, we introduce AdaptVision, an efficient VLM paradigm that enables adaptive visual token acquisition through a coarse-to-fine approach. Our model initially processes compressed visual tokens from low-resolution images and selectively acquires additional visual information by invoking a bounding box tool to crop key regions when necessary. We train AdaptVision using a reinforcement learning framework that carefully balances accuracy and efficiency. Central to our approach is Decoupled Turn Policy Optimization (DTPO), which decouples the learning objective into two components: (1) tool learning, which optimizes correct tool utilization, and (2) accuracy improvement, which refines the generated responses to improve answer correctness. Based on this formulation, we further decouple advantage estimation by computing separate advantages for tokens associated with each objective. This formulation enables more effective optimization for AdaptVision compared to vanilla GRPO. Comprehensive experiments across multiple VQA benchmarks demonstrate that AdaptVision achieves superior performance while consuming substantially fewer visual tokens than state-of-the-art efficient VLM methods.
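The decoupled advantage estimation described above can be sketched as follows. This is an illustrative reconstruction assuming GRPO-style group-normalized advantages computed separately per objective; all function names are hypothetical and the paper's exact DTPO formulation may differ:

```python
import numpy as np

def group_advantages(rewards):
    """GRPO-style group normalization: (r - mean) / (std + eps)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

def decoupled_advantages(tool_rewards, answer_rewards):
    """DTPO-inspired decoupling (illustrative): normalize the tool-use
    reward and the answer-correctness reward in separate groups, so that
    tool-call tokens and answer tokens each receive their own advantage."""
    return group_advantages(tool_rewards), group_advantages(answer_rewards)
```

Keeping the two reward streams in separate groups prevents a rollout's answer quality from washing out the signal for whether the bounding-box tool was invoked correctly, which is the point of the decoupled objective.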
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Block Transformer: Global-to-Local Language Modeling for Fast Inference
We introduce the Block Transformer, which applies hierarchical global-to-local modeling to autoregressive transformers to mitigate the inference bottlenecks associated with self-attention. At every decoding step, self-attention must retrieve the key-value (KV) cache of the entire preceding sequence from memory to access context information, leading to two primary bottlenecks during batch inference. First, there is a significant delay in obtaining the first token, as the entire prompt must be processed to prefill the KV cache.
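The KV-cache bottleneck can be made concrete with a back-of-the-envelope size calculation. The model shapes below are illustrative (a typical 7B-class configuration), not taken from the paper:

```python
def kv_cache_bytes(batch, seq_len, n_layers, n_heads, head_dim, dtype_bytes=2):
    """Total KV-cache size: keys + values, for every layer, head, and
    token position, at dtype_bytes per element (2 for fp16/bf16)."""
    return 2 * batch * seq_len * n_layers * n_heads * head_dim * dtype_bytes

# Illustrative shapes: 32 layers, 32 heads, head_dim 128, 4k context, batch 8
size = kv_cache_bytes(batch=8, seq_len=4096, n_layers=32, n_heads=32, head_dim=128)
print(f"KV cache: {size / 2**30:.1f} GiB")  # 16.0 GiB
```

Since this entire cache must be streamed from memory at every decoding step, batch decoding quickly becomes memory-bandwidth bound, which is what global-to-local modeling aims to relieve.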
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Overview (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.77)
We sincerely thank all reviewers for the insightful comments and feedback on our work on learning from failure (LfF). We do not interpret this as a "true" trade-off, as debiasing does not degrade the model's […] Instead, we view the apparent underperformance as a result of "not utilizing a (delusional) spurious correlation." Following R1's suggestion, we additionally test ReBias [2] (SOTA among […]). This is also consistent with our claim that LfF is not "domain-specific". However, this consistency may not hold depending on the definition of "domain." Hence, we deeply resonate with R2's concern, and we will further clarify the type of knowledge used by LfF […] For example, we will modify L2-5 in the abstract to "In this work, we propose a new algorithm utilizing a […]" However, we only use LfF's yes/no type of knowledge for choosing one of the attributes as an undesired […] Following R2's suggestion, we further verify […] Our LfF combination rule achieves 74.01% […] We will add more discussions and experiments in the final draft.
Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models
Zou, Zhengtao, Gao, Ya, Guan, Jiarui, Li, Bin, Marttinen, Pekka
Large Vision-Language Models (LVLMs) often suffer from object hallucination, generating text inconsistent with visual inputs, which can critically undermine their reliability. Existing inference-time interventions to mitigate this issue present a challenging trade-off: while methods that steer internal states or adjust output logits can be effective, they often incur substantial computational overhead, typically requiring extra forward passes. This efficiency bottleneck can limit their practicality for real-world, latency-sensitive deployments. In this work, we aim to address this trade-off with Residual-Update Directed DEcoding Regulation (RUDDER), a low-overhead framework that steers LVLMs towards visually-grounded generation. RUDDER is built on two key innovations: (1) Contextual Activation Residual Direction (CARD) vector, a per-sample visual evidence vector extracted from the residual update of a self-attention layer during a single, standard forward pass. Extensive experiments on key hallucination benchmarks, including POPE and CHAIR, indicate that RUDDER achieves performance comparable to state-of-the-art methods while introducing negligible computational latency, validating RUDDER as a pragmatic and effective approach for improving LVLMs' reliability without a significant compromise on efficiency. Code is available at https://anonymous.4open.science/r/ While Large Vision-Language Models (LVLMs) have shown remarkable capabilities in multimodal tasks and are increasingly deployed to assist with real-world problems (Alayrac et al., 2022; Liu et al., 2024a), their practical reliability is critically undermined by a persistent challenge: object hallucination. As shown in Figure 1, LVLMs frequently generate fluent, convincing text that is factually inconsistent with visual groundings, severely limiting their real-world utility and credibility (Ji et al., 2023).
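A minimal sketch of residual-update steering of the kind RUDDER describes, under our own simplifying assumption that the CARD-style direction is a unit-normalized mean of the self-attention residual update over visual token positions; all names here are illustrative, not the paper's API:

```python
import numpy as np

def card_direction(residual_update):
    """Illustrative CARD-style direction: unit-normalized mean of a
    self-attention layer's residual update over visual token positions,
    captured during one standard forward pass."""
    d = np.asarray(residual_update, float).mean(axis=0)
    return d / (np.linalg.norm(d) + 1e-8)

def steer(hidden, direction, alpha=0.5):
    """Nudge a hidden state along the direction. This is a single vector
    add, so it introduces no extra forward pass."""
    return np.asarray(hidden, float) + alpha * np.asarray(direction, float)
```

The design point is that the direction is extracted from activations the model computes anyway, which is what keeps the overhead negligible compared with logit-adjustment methods that need additional passes.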
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Education (0.68)
- Information Technology (0.46)
In-Context Learning Without Copying
Sahin, Kerem, Feucht, Sheridan, Belfki, Adam, Brinkmann, Jannik, Mueller, Aaron, Bau, David, Wendler, Chris
Induction heads are attention heads that perform inductive copying by matching patterns from earlier context and copying their continuations verbatim. As models develop induction heads, they often experience a sharp drop in training loss, a phenomenon cited as evidence that induction heads may serve as a prerequisite for more complex in-context learning (ICL) capabilities. In this work, we ask whether transformers can still acquire ICL capabilities when inductive copying is suppressed. We propose Hapax, a setting where we omit the loss contribution of any token that can be correctly predicted by induction heads. Despite a significant reduction in inductive copying, performance on abstractive ICL tasks (i.e., tasks where the answer is not contained in the input context) remains comparable and surpasses the vanilla model on 13 of 21 tasks, even though 31.7% of tokens are omitted from the loss. Furthermore, our model achieves lower loss values on token positions that cannot be predicted correctly by induction heads. Mechanistic analysis further shows that models trained with Hapax develop fewer and weaker induction heads but still preserve ICL capabilities. Taken together, our findings indicate that inductive copying is not essential for learning abstractive ICL mechanisms.
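The loss-masking idea in Hapax can be sketched as follows: drop the loss contribution of any token an induction head already predicts correctly. This is an illustrative reduction; the paper's criterion for "predictable by induction heads" is determined mechanistically, not given as a boolean mask:

```python
import numpy as np

def hapax_loss(token_losses, induction_correct):
    """Mean loss over tokens NOT correctly predicted by induction heads.

    token_losses: per-token cross-entropy values.
    induction_correct: boolean per-token flags (assumed given here)
    marking tokens an induction head would copy correctly.
    """
    losses = np.asarray(token_losses, float)
    keep = ~np.asarray(induction_correct, bool)
    if keep.sum() == 0:
        return 0.0  # every token was induction-predictable; nothing to train on
    return float(losses[keep].mean())
```

Masking rather than reweighting means the gradient carries no signal at copyable positions at all, which is what suppresses the incentive to form strong induction heads.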
A Novel XAI-Enhanced Quantum Adversarial Networks for Velocity Dispersion Modeling in MaNGA Galaxies
Narkedimilli, Sathwik, Kumar, N V Saran, H, Aswath Babu, Vanahalli, Manjunath K, M, Manish, Jain, Vinija, Chadha, Aman
In the ever-evolving landscape of astrophysics and machine learning, understanding the internal kinematics of galaxies remains a formidable challenge. Traditional techniques for modeling galaxy dynamics have offered valuable insights but are often limited by their inability to capture complex, non-linear relationships in high-dimensional data. Recent advances in quantum computing and explainable artificial intelligence (XAI) provide new avenues for addressing these challenges, paving the way for more sophisticated and interpretable models in astrophysical research [19] [20] [21]. Galaxy velocity dispersion is a critical parameter that underpins our understanding of the mass distribution, dynamical state, and evolutionary history of galaxies. By analyzing detailed stellar population and kinematic properties--such as morphological classification, effective radius, and gradients in stellar age and metallicity--the prediction of velocity dispersion becomes central to characterizing the intricate interplay between a galaxy's structure and its dynamic behavior. The MaNGA dataset, with its rich set of 11 features, offers a robust platform for exploring these phenomena and highlights the technical demands of achieving accurate predictions in this domain [1].
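Since the underlying task is regression of velocity dispersion from 11 tabular features, a plain least-squares baseline illustrates the setup. The data below is synthetic stand-in data, not the MaNGA catalog, and the linear model is only a reference point against which non-linear or quantum-enhanced models would be compared:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for 11 MaNGA-style features (mock data, not MaNGA)
X = rng.normal(size=(200, 11))
w_true = rng.normal(size=11)
y = X @ w_true + 0.01 * rng.normal(size=200)   # mock velocity dispersion target

# Ordinary least-squares baseline with an intercept column
A = np.c_[X, np.ones(len(X))]
w, *_ = np.linalg.lstsq(A, y, rcond=None)
rmse = float(np.sqrt(np.mean((A @ w - y) ** 2)))
```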
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
The Robustness of Differentiable Causal Discovery in Misspecified Scenarios
Yi, Huiyang, He, Yanyan, Chen, Duxin, Kang, Mingyu, Wang, He, Yu, Wenwu
Causal discovery aims to learn causal relationships between variables from observational data, making it a fundamental task in machine learning. However, causal discovery algorithms often rely on unverifiable causal assumptions, which are usually difficult to satisfy in real-world data, thereby limiting the broad application of causal discovery in practical scenarios. Inspired by these considerations, this work extensively benchmarks the empirical performance of various mainstream causal discovery algorithms, which assume i.i.d. data, under eight model assumption violations. Our experimental results show that differentiable causal discovery methods exhibit robustness under the metrics of Structural Hamming Distance and Structural Intervention Distance of the inferred graphs in commonly used challenging scenarios, except for scale variation. We also provide theoretical explanations for the performance of differentiable causal discovery methods. Finally, our work aims to comprehensively benchmark the performance of recent differentiable causal discovery methods under model assumption violations, to provide a standard for the reasonable evaluation of causal discovery, and to further promote its application in real-world scenarios.
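Structural Hamming Distance, one of the two evaluation metrics named above, can be computed as follows. This uses a common convention in which a missing or extra edge counts as one error and a reversed edge also counts as one error; exact conventions vary between implementations:

```python
import numpy as np

def shd(g_true, g_pred):
    """Structural Hamming Distance between directed adjacency matrices:
    missing/extra edges count once each, a reversed edge counts once."""
    a = np.asarray(g_true, bool)
    b = np.asarray(g_pred, bool)
    # Edges present in only one graph's skeleton (each counted twice symmetrically)
    skel_diff = (a | a.T) ^ (b | b.T)
    missing_or_extra = int(skel_diff.sum()) // 2
    # Edges present in both skeletons but oriented in opposite directions
    reversed_edges = int((a & b.T & ~b).sum())
    return missing_or_extra + reversed_edges
```

For example, for the true chain 0 -> 1 -> 2, predicting 0 -> 1 and 2 -> 1 gives an SHD of 1 (one reversed edge), while predicting the empty graph gives an SHD of 2.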